286 research outputs found

    Metric learning pairwise kernel for graph inference

    Much recent work in bioinformatics has focused on the inference of various types of biological networks, representing gene regulation, metabolic processes, protein-protein interactions, etc. A common setting involves inferring network edges in a supervised fashion from a set of high-confidence edges, possibly characterized by multiple, heterogeneous data sets (protein sequence, gene expression, etc.). Here, we distinguish between two modes of inference in this setting: direct inference based upon similarities between nodes joined by an edge, and indirect inference based upon similarities between one pair of nodes and another pair of nodes. We propose a supervised approach for the direct case by translating it into a distance metric learning problem. A relaxation of the resulting convex optimization problem leads to the support vector machine (SVM) algorithm with a particular kernel for pairs, which we call the metric learning pairwise kernel (MLPK). We demonstrate, using several real biological networks, that this direct approach often improves upon the state-of-the-art SVM for indirect inference with the tensor product pairwise kernel
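
The two pairwise kernels contrasted in this abstract can be sketched as follows. The abstract itself gives no formulas, so the forms below are the ones usually stated in the literature and should be read as an assumption: for node pairs (a1, a2) and (b1, b2) and a base node kernel K, MLPK = (K(a1,b1) - K(a1,b2) - K(a2,b1) + K(a2,b2))^2 and TPPK = K(a1,b1)K(a2,b2) + K(a1,b2)K(a2,b1). The function names and the linear base kernel are illustrative choices, not taken from the paper.

```python
import numpy as np

def linear_kernel(X):
    """Base kernel between nodes (assumption: a plain linear kernel)."""
    return X @ X.T

def _split(pairs_a, pairs_b):
    # Each row of a pairs array holds the two node indices of one candidate edge.
    return pairs_a[:, 0], pairs_a[:, 1], pairs_b[:, 0], pairs_b[:, 1]

def mlpk(K, pairs_a, pairs_b):
    """Metric learning pairwise kernel between two lists of node pairs."""
    a1, a2, b1, b2 = _split(pairs_a, pairs_b)
    diff = (K[np.ix_(a1, b1)] - K[np.ix_(a1, b2)]
            - K[np.ix_(a2, b1)] + K[np.ix_(a2, b2)])
    return diff ** 2

def tppk(K, pairs_a, pairs_b):
    """Tensor product pairwise kernel between two lists of node pairs."""
    a1, a2, b1, b2 = _split(pairs_a, pairs_b)
    return (K[np.ix_(a1, b1)] * K[np.ix_(a2, b2)]
            + K[np.ix_(a1, b2)] * K[np.ix_(a2, b1)])
```

Either Gram matrix could then be passed to an SVM implementation that accepts precomputed kernels. With a linear base kernel, the MLPK value reduces to the squared inner product of the two difference vectors, which is why it compares the nodes within a pair rather than one pair against another.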

    Inferring Diploid 3D Chromatin Structures from Hi-C Data

    The 3D organization of the genome plays a key role in many cellular processes, such as gene regulation, differentiation, and replication. Assays like Hi-C measure DNA-DNA contacts in a high-throughput fashion, and inferring accurate 3D models of chromosomes can yield insights hidden in the raw data. For example, structural inference can account for noise in the data, disambiguate the distinct structures of homologous chromosomes, orient genomic regions relative to nuclear landmarks, and serve as a framework for integrating other data types. Although many methods exist to infer the 3D structure of haploid genomes, inferring a diploid structure from Hi-C data is still an open problem. Indeed, the diploid case is very challenging, because Hi-C data typically does not distinguish between homologous chromosomes. We propose a method to infer 3D diploid genomes from Hi-C data. We demonstrate the accuracy of the method on simulated data, and we also use the method to infer 3D structures for mouse chromosome X, confirming that the inactive homolog exhibits a bipartite structure, whereas the active homolog does not.

    Jointly Embedding Multiple Single-Cell Omics Measurements

    Many single-cell sequencing technologies are now available, but it is still difficult to apply multiple sequencing technologies to the same single cell. In this paper, we propose an unsupervised manifold alignment algorithm, MMD-MA, for integrating multiple measurements carried out on disjoint aliquots of a given population of cells. Effectively, MMD-MA performs an in silico co-assay by embedding cells measured in different ways into a learned latent space. In the MMD-MA algorithm, single-cell data points from multiple domains are aligned by optimizing an objective function with three components: (1) a maximum mean discrepancy (MMD) term to encourage the differently measured points to have similar distributions in the latent space, (2) a distortion term to preserve the structure of the data between the input space and the latent space, and (3) a penalty term to avoid collapse to a trivial solution. Notably, MMD-MA does not require any correspondence information across data modalities, either between the cells or between the features. Furthermore, MMD-MA's weak distributional requirements for the domains to be aligned allow the algorithm to integrate heterogeneous types of single cell measures, such as gene expression, DNA accessibility, chromatin organization, methylation, and imaging data. We demonstrate the utility of MMD-MA in simulation experiments and using a real data set involving single-cell gene expression and methylation data.
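
The first component of the objective, the MMD term, can be illustrated with a minimal sketch. It computes the standard biased estimate of squared MMD between two sets of latent-space points. The Gaussian kernel, the bandwidth `sigma`, and the function names are assumptions made for illustration, since the abstract does not say which kernel is used. The distortion and penalty terms are omitted because the abstract does not define their form.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    """Gaussian kernel matrix between rows of A and rows of B (assumed kernel)."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2 * A @ B.T
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq, 0.0) / (2 * sigma ** 2))

def mmd2(Z1, Z2, sigma=1.0):
    """Biased (V-statistic) estimate of squared MMD between two point sets."""
    return (gaussian_kernel(Z1, Z1, sigma).mean()
            + gaussian_kernel(Z2, Z2, sigma).mean()
            - 2 * gaussian_kernel(Z1, Z2, sigma).mean())
```

A small value indicates that the two embedded point clouds have similar distributions, which is what the MMD term encourages. Note that no cell-to-cell correspondence is needed to evaluate it, and the two sets may have different sizes.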

    Comparison of Patient and Surgeon Expectations of Total Hip Arthroplasty

    OBJECTIVES: Analysis of discrepancies between patient and surgeon expectations before total hip arthroplasty (THA) should enable a better understanding of the reasons for dissatisfaction with surgery, but this question has seldom been studied. Our objectives were to compare surgeons' and patients' expectations before THA, and to study factors that affected surgeon-patient agreement. METHODS: 132 adults (mean age 62.8 ± 13.7 years, 52% men) on the waiting list for THA in three tertiary care centres and their 16 surgeons were interviewed to assess their expectations using the Hospital for Special Surgery Total Hip Replacement Expectations Survey (range 0-100). Patients' and surgeons' answers were compared, for the total score and for the score of each item. Univariate analyses tested the effect of patients' characteristics on surgeons' and patients' expectations separately, and on surgeon-patient differences. RESULTS: Mean expectation scores were high for both surgeons and patients (90.9 ± 11.1 and 90.0 ± 11.6 out of 100, respectively). Surgeons' and patients' expectations showed no systematic difference, but the Bland-Altman plot showed little agreement and the correlation coefficient was low. Patients had higher expectations than surgeons for sports. Patients rated their expectations according to trust in their physician and mental quality of life, whereas surgeons considered disability. More disabled patients and patients from a low-income professional category were often "more optimistic" than their surgeons. CONCLUSION: Surgeons and patients often do not agree on what to expect from THA. More disabled patients expect better outcomes than their surgeons.

    A new pairwise kernel for biological network inference with support vector machines

    BACKGROUND: Much recent work in bioinformatics has focused on the inference of various types of biological networks, representing gene regulation, metabolic processes, protein-protein interactions, etc. A common setting involves inferring network edges in a supervised fashion from a set of high-confidence edges, possibly characterized by multiple, heterogeneous data sets (protein sequence, gene expression, etc.). RESULTS: Here, we distinguish between two modes of inference in this setting: direct inference based upon similarities between nodes joined by an edge, and indirect inference based upon similarities between one pair of nodes and another pair of nodes. We propose a supervised approach for the direct case by translating it into a distance metric learning problem. A relaxation of the resulting convex optimization problem leads to the support vector machine (SVM) algorithm with a particular kernel for pairs, which we call the metric learning pairwise kernel. This new kernel for pairs can easily be used by most SVM implementations to solve problems of supervised classification and inference of pairwise relationships from heterogeneous data. We demonstrate, using several real biological networks and genomic datasets, that this approach often improves upon the state-of-the-art SVM for indirect inference with another pairwise kernel, and that the combination of both kernels always improves upon each individual kernel. CONCLUSION: The metric learning pairwise kernel is a new formulation to infer pairwise relationships with SVM, which provides state-of-the-art results for the inference of several biological networks from heterogeneous genomic data.

    Knee Arthroplasty: Disabilities in Comparison to the General Population and to Hip Arthroplasty Using a French National Longitudinal Survey

    BACKGROUND: Knee arthroplasty is increasing exponentially due to the aging of the population and to the broadening of indications. We aimed to compare physical disability and its evolution over two years in people with knee arthroplasty to that in the general population. A secondary objective was to compare the level of disability of people with knee arthroplasty to that of people with hip arthroplasty. METHODOLOGY/PRINCIPAL FINDINGS: 16,945 people representative of the French population were selected in 1999 from the French census and interviewed about their level of disability. This sample included 815 people with lower limb arthroplasty. In 2001, 608 of them were re-interviewed, among whom 134 had knee arthroplasty. Among the other participants re-interviewed, we identified 68 who had undergone knee arthroplasty and 145 hip arthroplasty within the last two years (recent arthroplasty). People with knee arthroplasty reported significantly greater difficulties than the general population with bending forward (odds ratio [OR] = 4.7; 95% confidence interval [CI]: 1.7, 12.6), walking more than 500 meters (OR = 6.0; 95% CI: 1.5, 24.7) and carrying 5 kg for 10 meters (OR = 4.6; 95% CI: 1.3, 16.4). However, the two-year evolution in disability was similar to that in the general population for most activities. The level of mobility was similar between people with recent knee arthroplasty and those with recent hip arthroplasty. Nevertheless, people with recent knee arthroplasty reported a lower level of disability than the other group for washing and bending forward (OR = 0.3; 95% CI: 0.1, 0.6 and OR = 0.4; 95% CI: 0.1, 0.9, respectively). CONCLUSIONS/SIGNIFICANCE: People with knee arthroplasty reported a higher risk of disability than the general population for common activities of daily living but a similar evolution. There was no relevant difference between recent knee and hip arthroplasties for mobility.

    Comparing HLA Shared Epitopes in French Caucasian Patients with Scleroderma

    Although many studies have analyzed HLA allele frequencies in several ethnic groups in patients with scleroderma (SSc), none has been done in French Caucasian patients and none has evaluated which of the common amino acid sequences, 67FLEDR71, shared by HLA-DRB susceptibility alleles, or 71TRAELDT77, shared by HLA-DQB1 susceptibility alleles in SSc, is the more important for developing the disease. HLA-DRB and DQB typing was performed for a total of 468 healthy controls and 282 patients with SSc, allowing FLEDR and TRAELDT analyses. Results were stratified according to patients' clinical subtypes and autoantibody status. Moreover, standardized HLA-DRβ1 and DRβ5 reverse transcriptase Taqman PCR assays were developed to quantify β1 and β5 mRNA in 20 subjects with HLA-DRB1*15 and/or DRB1*11 haplotypes. The FLEDR motif is highly associated with diffuse SSc (χ² = 28.4, p < 10⁻⁶) and with anti-topoisomerase antibody (ATA) production (χ² = 43.9, p < 10⁻⁹), whereas the TRAELDT association is weaker in both subgroups (χ² = 7.2, p = 0.027 and χ² = 14.6, p = 0.0007, respectively). Moreover, the FLEDR motif association among patients with diffuse SSc remains significant only in the ATA subgroup. The risk of developing ATA-positive SSc is higher with a double dose of FLEDR than with a single dose, with adjusted standardised residuals of 5.1 and 2.6, respectively. The increase in the FLEDR motif is mostly due to the higher frequency of HLA-DRB1*11 and DRB1*15 haplotypes. Furthermore, FLEDR is always carried by the most abundantly expressed β chain: β1 in HLA-DRB1*11 haplotypes and β5 in HLA-DRB1*15 haplotypes.